Add TelemetryConfigRef support to MCPRemoteProxy#4719
Conversation
Brings MCPRemoteProxy to parity with MCPServer's telemetry API by adding a TelemetryConfigRef field for referencing shared MCPTelemetryConfig resources. Includes CEL mutual exclusivity validation with the deprecated inline Telemetry field, TelemetryConfigHash in status for change detection, and condition type/reason constants. Part of #4620 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Namespace-scoped fetch for MCPTelemetryConfig referenced by an MCPRemoteProxy. Returns (nil, nil) when the ref is nil or the resource is not found, matching the MCPServer getTelemetryConfigForMCPServer contract. Part of #4620 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…roxy Wire up the reconciler to validate referenced MCPTelemetryConfig resources, track config hashes in status, and reconcile when the underlying MCPTelemetryConfig changes. Follows the same handler pattern as handleToolConfig and the MCPServer telemetry handler. Part of #4620 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…Config When building the RunConfig, resolve telemetry from the referenced MCPTelemetryConfig first and fall back to the deprecated inline Telemetry field. Adds ctx parameter to createRunConfigFromMCPRemoteProxy to support the API fetch, matching the MCPServer runconfig pattern. Part of #4620 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Auto-generated from MCPRemoteProxy type changes: adds telemetryConfigRef to CRD schema with CEL mutual exclusivity validation, telemetryConfigHash to status, and updated API reference docs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract resolveToolConfig and addTelemetryOptions from createRunConfigFromMCPRemoteProxy to bring cyclomatic complexity below the threshold. Rename shadowed ctx to apiCtx for clarity. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4719 +/- ##
==========================================
+ Coverage 68.71% 68.94% +0.23%
==========================================
Files 517 518 +1
Lines 54817 54921 +104
==========================================
+ Hits 37666 37868 +202
+ Misses 14252 14142 -110
- Partials 2899 2911 +12 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
jhrozek
left a comment
There was a problem hiding this comment.
Code review from automated analysis. Five findings — three require changes in files not touched by this PR (prose below), two have inline comments on the diff.
[HIGH] Missing +kubebuilder:rbac annotation for mcptelemetryconfigs
File: cmd/thv-operator/controllers/mcpremoteproxy_controller.go (around line 48–51)
The MCPRemoteProxyReconciler now lists and gets MCPTelemetryConfig objects (in mapTelemetryConfigToMCPRemoteProxy and via GetTelemetryConfigForMCPRemoteProxy), but there is no +kubebuilder:rbac marker for mcptelemetryconfigs. task operator-manifests generates the ClusterRole from these markers — without this one, the controller gets permission-denied errors at runtime the first time any proxy uses TelemetryConfigRef.
MCPServerReconciler already has:
// +kubebuilder:rbac:groups=toolhive.stacklok.dev,resources=mcptelemetryconfigs,verbs=get;list;watchAdd the same line to MCPRemoteProxyReconciler alongside the existing mcptoolconfigs/mcpoidcconfigs markers.
[HIGH] MCPTelemetryConfigReconciler.findReferencingWorkloads does not scan MCPRemoteProxy
File: cmd/thv-operator/controllers/mcptelemetryconfig_controller.go line 259 (not changed in this PR)
findReferencingWorkloads only calls FindWorkloadRefsFromMCPServers. It never looks at MCPRemoteProxy, so:
- Deletion protection is broken: an operator can delete an
MCPTelemetryConfigreferenced only by a proxy — the finalizer check passes. status.referencingWorkloadswill never include proxy names.- The reverse-direction watch (proxy changes → re-evaluate
MCPTelemetryConfig) is also missing fromSetupWithManager.
MCPOIDCConfigReconciler and MCPExternalAuthConfigReconciler were both extended to include MCPRemoteProxy when those ref fields were added. The same pattern needs to be applied here.
[HIGH] Missing CA bundle volume mount for TelemetryConfigRef in proxy deployment
File: cmd/thv-operator/controllers/mcpremoteproxy_deployment.go (buildVolumesForProxy, not changed in this PR)
addTelemetryOptions correctly calls ctrlutil.TelemetryCABundleFilePath(telCfg) and embeds the path in the RunConfig. At pod startup the proxy runner opens that path via os.ReadFile to build a custom TLS config for the OTLP collector.
However, buildVolumesForProxy never calls ctrlutil.AddTelemetryCABundleVolumes. The ConfigMap is never mounted into the pod. Any user who sets spec.openTelemetry.caBundleRef in their MCPTelemetryConfig will get a file-not-found error at proxy startup.
MCPServerReconciler.deploymentForMCPServer already handles this correctly — the same block needs to be added to deploymentForMCPRemoteProxy.
cmd/thv-operator/controllers/mcpremoteproxy_telemetryconfig_test.go
Outdated
Show resolved
Hide resolved
cmd/thv-operator/controllers/mcpremoteproxy_telemetryconfig_test.go
Outdated
Show resolved
Hide resolved
Five fixes from code review: 1. Add +kubebuilder:rbac marker for mcptelemetryconfigs on MCPRemoteProxyReconciler — required for runtime API access 2. Extend MCPTelemetryConfigReconciler.findReferencingWorkloads to scan MCPRemoteProxy in addition to MCPServer, fixing deletion protection and referencingWorkloads status. Add MCPRemoteProxy watch to SetupWithManager for the reverse-direction reconciliation 3. Mount telemetry CA bundle volumes in proxy deployment via addTelemetryCABundleVolumes — without this, caBundleRef users get file-not-found at proxy startup 4. Fix condition removal not persisted when TelemetryConfigHash is already empty — check for stale condition before deciding whether to call Status().Update() 5. Fix test assertions to re-fetch persisted state from fakeClient instead of reading in-memory pointers. Add recovery-from-False and stale-condition-removal test cases Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Thanks for the thorough review @jhrozek — all five findings are addressed in 09a948e. [HIGH] Missing
|
Two new integration tests in the mcp-telemetry-config suite: 1. MCPRemoteProxy tracked in ReferencingWorkloads — creates a proxy referencing an MCPTelemetryConfig and asserts the proxy appears in status.referencingWorkloads via the MCPRemoteProxy watch handler. 2. Deletion protection for proxy-only references — verifies that the finalizer blocks deletion while an MCPRemoteProxy references the config, and allows deletion after the proxy is removed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
jhrozek
left a comment
There was a problem hiding this comment.
All five review findings have been addressed in the latest push:
- RBAC annotation —
+kubebuilder:rbacformcptelemetryconfigsadded toMCPRemoteProxyReconciler - findReferencingWorkloads — now scans
MCPRemoteProxyviaFindReferencingMCPRemoteProxies, plusWatchesand mapper added toMCPTelemetryConfigReconciler.SetupWithManager - CA bundle volume mount — new
addTelemetryCABundleVolumesmethod wires up the ConfigMap volume in the proxy deployment - Test assertions — tests now re-fetch persisted state via
fakeClient.Getinstead of asserting on in-memory pointers - nil-ref condition persistence —
condRemoved || hash != ""guard ensures the status update fires even when the hash was already empty
New recovery-from-False and stale-condition-removal test cases also added. Integration tests extended for the telemetry config controller. Looks good.
Summary
MCPRemoteProxy lacked the ability to reference shared
MCPTelemetryConfigresources, forcing users to duplicate inline telemetry configuration across every proxy instance. This brings MCPRemoteProxy to parity with MCPServer's telemetry API by addingTelemetryConfigRefsupport.Closes #4620
Type of change
Changes
cmd/thv-operator/api/v1alpha1/mcpremoteproxy_types.goTelemetryConfigReffield with CEL mutual exclusivity validation, deprecate inlineTelemetry, addTelemetryConfigHashstatus field and condition constantscmd/thv-operator/pkg/controllerutil/config.goGetTelemetryConfigForMCPRemoteProxy()helper — namespace-scoped fetch returning (nil, nil) for not-foundcmd/thv-operator/controllers/mcpremoteproxy_controller.gohandleTelemetryConfig()handler, call invalidateAndHandleConfigs(), addmapTelemetryConfigToMCPRemoteProxy()watch mapper, registerMCPTelemetryConfigwatch inSetupWithManager()cmd/thv-operator/controllers/mcpremoteproxy_runconfig.goTelemetryConfigRefover inlineTelemetryin RunConfig; extractaddTelemetryOptionsandresolveToolConfighelpers to fix cyclomatic complexityTest plan
task lint-fixpasses (0 issues)task testpasses (all operator unit tests)go build ./...passesTestGetTelemetryConfigForMCPRemoteProxy(4 cases),TestHandleTelemetryConfig_MCPRemoteProxy(5 cases),TestMapTelemetryConfigToMCPRemoteProxyTestCreateRunConfigFromMCPRemoteProxytests still pass with updatedctxparameterSpecial notes for reviewers
resolveToolConfigextraction inmcpremoteproxy_runconfig.gois a refactor needed to keep cyclomatic complexity under the lint threshold (was already at 15 pre-change). No behavioral change.Generated with Claude Code